Intro

This notebook investigates how DonorsChoose donations might increase the dollar-per-student amount in lower income schools in order to achieve equity with higher income schools.

Definitions of low and high income schools are based on per pupil expenditure released at the school level and made available on each state’s department of education website. The high/low split was done by Brian Heseung Kim; briefly, we sorted the schools by per pupil expenditure and took the top and bottom quintiles for each state. Of note, we split New York into schools inside and outside New York City, due to the increased costs (and thus expenditures) in New York City.
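As a rough illustration of the quintile split described above, here is a minimal Python sketch. It is not the code used for the actual split; the dictionary keys `state` and `perpupil` and the inclusive percentile method are assumptions made for the example. Because New York City and the rest of New York State carry separate "state" values, grouping on that value handles the New York split without special cases.

```python
from collections import defaultdict
from statistics import quantiles


def flag_quintiles(schools):
    """Return copies of the school records with bottom_group / top_group flags.

    `schools` is a list of dicts with the (assumed) keys 'state' and
    'perpupil'. Each state needs at least two schools.
    """
    # Collect per pupil expenditures separately for each state value.
    by_state = defaultdict(list)
    for school in schools:
        by_state[school["state"]].append(school["perpupil"])

    # quantiles(n=5) returns four cut points: the 20th, 40th, 60th and
    # 80th percentiles. We keep the first and the last.
    cutoffs = {}
    for state, values in by_state.items():
        cuts = quantiles(values, n=5, method="inclusive")
        cutoffs[state] = (cuts[0], cuts[3])

    flagged = []
    for school in schools:
        p20, p80 = cutoffs[school["state"]]
        flagged.append({
            **school,
            "bottom_group": school["perpupil"] <= p20,
            "top_group": school["perpupil"] >= p80,
        })
    return flagged
```

Schools that fall between the two cut points get `False` for both flags and would drop out of a high/low comparison.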

We operationalized the DonorsChoose donations by summing the total donations from every funded project across each school and then dividing by either the total enrolled students or the number of students reached, as defined in the DonorsChoose data. In the following scripts, the total donation value is denoted total_funding_by_donation, total donated / enrolled is denoted perpupil_donation, and total donated / reached is the corresponding per-student-reached measure.
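The two per-student measures can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the notebook's own script: the record keys `school_id`, `donation_total` and `students_reached` are hypothetical, `perreached_donation` is a name invented here for the reached-based measure, and summing students reached across a school's projects is an assumption that may double count students served by more than one project.

```python
from collections import defaultdict


def school_donation_measures(projects, enrollment):
    """Aggregate funded projects to one row of measures per school.

    projects   : list of dicts with assumed keys 'school_id',
                 'donation_total' and 'students_reached'
    enrollment : dict mapping school_id -> enrolled students
    """
    totals = defaultdict(float)
    reached = defaultdict(int)
    for project in projects:
        sid = project["school_id"]
        totals[sid] += project["donation_total"]
        reached[sid] += project["students_reached"]

    measures = {}
    for sid, total in totals.items():
        enrolled = enrollment.get(sid)
        measures[sid] = {
            "total_funding_by_donation": total,
            # None when enrollment is missing or zero
            "perpupil_donation": total / enrolled if enrolled else None,
            # hypothetical name for total donated / students reached
            "perreached_donation": total / reached[sid] if reached[sid] else None,
        }
    return measures
```

Returning `None` rather than raising on missing enrollment keeps those schools visible, so they can be counted in a later missingness check.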

After asking Mohammad about the Funding Status variable, I decided to exclude it from the analyses, since we do not know where that money ended up, and DonorsChoose has done different things with that quantity in different years.

Data Cleaning

The data had already been cleaned in the following ways by Brian Kim:

  1. Charter schools are excluded across the board to make our data more comparable across states
  2. New York State and New York City are split into separate “state” values (so use that to group_by later)
  3. bottom_group (an indicator for schools at or below the 20th percentile) was created
  4. top_group (an indicator for schools at or above the 80th percentile) was created
  5. Number of teachers is added in for each school

Additionally, I did the following:

  1. Filtered the DonorsChoose data to the following states/groups: Virginia, Illinois, Texas, Massachusetts, North Carolina, Florida, Georgia, New York State, New York City
  2. Filtered the data to only include projects, donations, and expenditures between 2018-08-01 and 2019-06-15
  3. Merged the DonorsChoose and cleaned expenditure data
  4. Created the total_funding_by_donation variable
  5. Checked for missingness. There were schools in the expenditure data that were missing from the DonorsChoose data, and vice versa. Additionally, we were missing enrollment data for 44 records where we had project, donation, and per pupil expenditure data.

Missingness by variable (TRUE = value is missing):

  perpupil                        n       percent
  FALSE                           44366   1

  total_funding_by_donation       n       percent
  FALSE                           44366   1

  Project.Donation.Total.Amount   n       percent
  FALSE                           44366   1

  enrollment                      n       percent
  FALSE                           44322   0.9990082
  TRUE                            44      0.0009918

Schools with missing enrollment:

  School.Nces.ID   school_name_nces
  360013504597     HOSPITAL SCHOOLS
  360009605960     EAST FLATBUSH COMMUNITY RESEARCH SCHOOL

Looking into the missing enrollment further, it seems that the two schools missing enrollment data are 1) a specialty school that students only attend if they are hospitalized, and 2) East Flatbush Community and Research School, which apparently merged with another school in 2017-2018. Thus, I decided to exclude both.
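The missingness check and the exclusion can be sketched in Python as below. This is a hedged illustration rather than the code behind the tables above: the helper names `tabulate_missing` and `drop_excluded` are hypothetical, missing values are assumed to be represented as `None`, and only the two NCES IDs come from the data shown.

```python
from collections import Counter

# NCES IDs of the two schools with missing enrollment (from the table above)
EXCLUDED_NCES_IDS = {"360013504597", "360009605960"}


def tabulate_missing(rows, column):
    """Map True/False (is the value missing?) to (count, share of rows)."""
    counts = Counter(row.get(column) is None for row in rows)
    total = len(rows)
    return {flag: (n, n / total) for flag, n in counts.items()}


def drop_excluded(rows, excluded=EXCLUDED_NCES_IDS):
    """Keep only rows whose school is not in the excluded set."""
    return [row for row in rows if row["School.Nces.ID"] not in excluded]
```

Running `tabulate_missing` again after `drop_excluded` is a quick way to confirm that no missing enrollment values remain.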

DonorsChoose Donations ~ High/Low Expenditures

As a first pass, I wanted to see how the DonorsChoose donations differ between schools that have high per student expenditures and schools that have low per student expenditures. Thus, the first plot shows violin plots for each state, broken down between high and low expenditure schools. Furthermore, I decided to log transform the expenditure data to make the distributions of high and low expenditure schools easier to compare visually.

There do not seem to be any locations in which lower income students are receiving more donations per student than their higher income counterparts. Most states seem to have fairly equal donation amounts across the two groups, though some states, such as Georgia, seem to have the reverse trend: more DonorsChoose money is going to the higher income schools than the lower income schools.

I am wondering if this is due to there being more students in the lower income schools. Since this measure divides by enrolled students, it is possible that lower income schools are getting equal or slightly more in donations, but the per pupil measure washes this effect out. To address this, I tried modifying our measure to use the total donated amount / the number of students reached. We were cautioned about using the students reached variable, since it was set by the teachers and is perhaps inflated. But I thought it was worth a shot!
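To make the concern about dividing by enrollment concrete, here is a small worked example in Python. The figures are hypothetical and chosen only for illustration; they are not taken from the data.

```python
def per_pupil(total_donated, enrolled):
    """Total donations divided by enrolled students."""
    return total_donated / enrolled


# Hypothetical figures: the lower income school is larger and receives
# more in total, yet ends up with less per enrolled student.
low_income = per_pupil(total_donated=6000, enrolled=1200)
high_income = per_pupil(total_donated=5000, enrolled=500)

print(low_income, high_income)  # 5.0 10.0
```

In this made-up case the lower income school receives 1,000 dollars more overall but only half as much per pupil, which is the kind of effect the per pupil measure could be hiding.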

This seems to soften the trend for higher income schools to receive more DonorsChoose money. Since we do not have 100% faith in that variable, however, I also wanted to look at the total funding amount, plotted next to the enrollment numbers.

From this, it looks like the lower income schools have more students across the board (not surprising). Unfortunately, even in lump sum donations, the higher income schools seem to be getting more DonorsChoose dollars.

Amount Requested

So why might that be? It could be that the lower income schools are requesting less money, or that their projects are less likely to be funded.

Across the board, schools do seem to be requesting a similar amount of money (which means lower income schools, having more students, are requesting less money per student).

Unique Projects

These plots show little evidence that projects from lower income schools are less likely to be funded or more likely to expire, with the exception of a slight bias in Massachusetts. I think this points to the problem being that lower income schools are asking for less money per pupil; that is, they request about the same total amount as their higher income counterparts despite having more students.